<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html><head><title>Programming in Lua : 20</title>
<link rel="stylesheet" type="text/css" href="pil.css">
</head>
<body>
<p class="warning">
<A HREF="contents.html"><IMG TITLE="Programming in Lua (first edition)" SRC="capa.jpg" ALT="" ALIGN="left"></A>This first edition was written for Lua 5.0. While still largely relevant for later versions, there are some differences.<BR>The third edition targets Lua 5.2 and is available at <A HREF="http://www.amazon.com/exec/obidos/ASIN/859037985X/theprogrammil3-20">Amazon</A> and other bookstores.<BR>By buying the book, you also help to <A HREF="../donations.html">support the Lua project</A>.
</p>
<table width="100%" class="nav">
<tr>
<td width="10%" align="left"><a href="19.3.html"><img border="0" src="left.png" alt="Previous"></a></td>
<td width="80%" align="center"><a href="contents.html"><font face="Helvetica,Arial,sanserif">
<font color="gray">Programming in </font><font color="blue"> Lua</font>
</font></a></td>
<td width="10%" align="right"><a href="20.1.html"><img border="0" src="right.png" alt="Next"></a></td>
</tr>
<tr>
<td width="10%" align="left"></td>
<td width="80%" align="center"><a href="contents.html#P3">Part III. The Standard Libraries</a>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<a href="contents.html#20">Chapter 20. The String Library</a></td>
<td width="10%" align="right"></td></tr>
</table>
<hr/>
<a name="StringLib"><h1>20 &ndash; The String Library</h1></a>

<p>The power of a raw Lua interpreter to manipulate strings is quite limited.
A program can create string literals and concatenate them.
But it cannot extract a substring, check its size,
or examine its contents.
The full power to manipulate strings in Lua
comes from its string library.

<p>Some functions in the string library are quite simple:
<code>string.len(s)</code>
returns the length of a string <code>s</code>.
<code>string.rep(s, n)</code>
returns the string <code>s</code> repeated <code>n</code> times.
You can create a string with 1M bytes (for tests, for instance)
with <code>string.rep("a", 2^20)</code>.
<code>string.lower(s)</code>
returns a copy of <code>s</code> with the
upper-case letters converted to lower case;
all other characters in the string are not changed
(<code>string.upper</code> converts to upper case).
As a typical use, if you want to sort an
array of strings regardless of case,
you may write something like
<pre>
    table.sort(a, function (a, b)
      return string.lower(a) &lt; string.lower(b)
    end)
</pre>
Both <code>string.upper</code> and <code>string.lower</code> follow the current locale.
Therefore, if you work with the European Latin-1 locale,
the expression
<pre>
    string.upper("a&ccedil;&atilde;o")
</pre>
results in <code>"A&Ccedil;&Atilde;O"</code>.

<p>The call <code>string.sub(s,i,j)</code> extracts a piece of the string <code>s</code>,
from the <code>i</code>-th to the <code>j</code>-th character inclusive.
In Lua, the first character of a string has index 1.
You can also use negative indices,
which count from the end of the string:
The index <em>-1</em> refers to the last character in a string,
<em>-2</em> to the previous one, and so on.
Therefore, the call <code>string.sub(s, 1, j)</code> gets a <em>prefix</em> of
the string <code>s</code> with length <code>j</code>;
<code>string.sub(s, j, -1)</code> gets a <em>suffix</em> of the string,
starting at the <code>j</code>-th character
(if you do not provide a third argument,
it defaults to <em>-1</em>,
so we could write the last call as <code>string.sub(s, j)</code>);
and <code>string.sub(s, 2, -2)</code> returns a copy of the string <code>s</code> with
the first and last characters removed:
<pre>
    s = "[in brackets]"
    print(string.sub(s, 2, -2))   -->  in brackets
</pre>

<p>Remember that strings in Lua are immutable.
The <code>string.sub</code> function,
like any other function in Lua,
does not change the value of a string,
but returns a new string.
A common mistake is to write something like
<pre>
    string.sub(s, 2, -2)
</pre>
and to assume that the
value of <code>s</code> will be modified.
If you want to modify the value of a variable,
you must assign the new value to the variable:
<pre>
    s = string.sub(s, 2, -2)
</pre>

<p>The <code>string.char</code> and <code>string.byte</code> functions convert
between characters and their internal numeric representations.
The function <code>string.char</code> gets zero or more integers,
converts each one to a character,
and returns a string concatenating all those characters.
The function <code>string.byte(s, i)</code> returns the internal numeric
representation of the <code>i</code>-th character of the string <code>s</code>;
the second argument is optional, so that a call <code>string.byte(s)</code>
returns the internal numeric representation of the first
(or single) character of <code>s</code>.
In the following examples,
we assume that characters are represented in ASCII:
<pre>
    print(string.char(97))                    -->  a
    i = 99; print(string.char(i, i+1, i+2))   -->  cde
    print(string.byte("abc"))                 -->  97
    print(string.byte("abc", 2))              -->  98
    print(string.byte("abc", -1))             -->  99
</pre>
In the last line, we used a negative index to access
the last character of the string.

<p>The function <code>string.format</code> is a powerful tool when formatting strings,
typically for output.
It returns a formatted version of its variable number of arguments
following the description given by its first argument,
the so-called <em>format</em> string.
The format string has rules similar to those of the <code>printf</code>
function of standard C:
It is composed of regular text and <em>directives</em>,
which control where and how each argument must be placed
in the formatted string.
A simple directive is the character `<code>%</code>&acute; plus a letter that
tells how to format the argument:
`<code>d</code>&acute; for a decimal number, `<code>x</code>&acute; for hexadecimal,
`<code>o</code>&acute; for octal,
`<code>f</code>&acute; for a floating-point number,
`<code>s</code>&acute; for strings, plus other variants.
Between the `<code>%</code>&acute; and the letter,
a directive can include other options,
which control the details of the format,
such as the number of decimal digits of a floating-point number:
<pre>
    print(string.format("pi = %.4f", PI))     --> pi = 3.1416
    d = 5; m = 11; y = 1990
    print(string.format("%02d/%02d/%04d", d, m, y))
      --> 05/11/1990
    tag, title = "h1", "a title"
    print(string.format("&lt;%s>%s&lt;/%s>", tag, title, tag))
      --> &lt;h1>a title&lt;/h1>
</pre>
In the first example, the <code>%.4f</code> means a floating-point number
with four digits after the decimal point.
In the second example, the <code>%02d</code> means a decimal number
(`<code>d</code>&acute;), with at least two digits and zero padding;
the directive <code>%2d</code>, without the zero,
would use blanks for padding.
For a complete description of those directives,
see the Lua reference manual.
Or, better yet, see a C manual,
as Lua calls the standard C libraries to do the hard work here.

<hr/>
<table width="100%" class="nav">
<tr valign="top">
<td width="90%" align="left">
<small>
  Copyright &copy; 2003&ndash;2004 Roberto Ierusalimschy.  All rights reserved.
</small>
</td>
<td width="10%" align="right"><a href="20.1.html"><img border="0" src="right.png" alt="Next"></a></td>
</tr>
</table>


</body></html>

